AMENDMENTS TO THE SPECIFICATION 

Page 1, after the title, please insert the following: 

--Related Applications

This application is a divisional of U.S. Patent Application Serial No. 09/562,916, filed
May 2, 2000 and entitled Construction of Trainable Semantic Vectors and Clustering,
Classification, and Searching Using Trainable Semantic Vectors, which claims priority from
U.S. Provisional Patent Application No. 60/177,654, filed January 27, 2000, which is
incorporated herein by reference.--

Please amend page 17, line 7 through line 9 as follows: 

A weighted average of u and v can also be used to determine the significance of data 
points, according to the following formula: 

TSV = α(v) + (1 - α)(u)
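
As a rough illustration of the weighted average described above, the following sketch combines the u and v vectors of a single word element by element. The function name `weighted_tsv`, the parameter name `alpha`, the restriction of the weight to the range 0 to 1, and the sample values are all assumptions made for illustration; none of them is taken from the specification.

```python
def weighted_tsv(u, v, alpha):
    """Return alpha * v + (1 - alpha) * u, computed entry by entry."""
    if len(u) != len(v):
        raise ValueError("u and v must cover the same categories")
    # Assumption: the weight is a fraction between 0 and 1.
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return [alpha * vi + (1.0 - alpha) * ui for ui, vi in zip(u, v)]


# Hypothetical two-category vectors, for illustration only.
u_example = [0.5, 0.25]
v_example = [1.0, 0.0]
tsv_example = weighted_tsv(u_example, v_example, alpha=0.5)
```

Under these assumptions, a weight of 1 reproduces v and a weight of 0 reproduces u, so the weight controls how much each distribution contributes to the resulting vector.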

Please amend page 36, line 2 through line 16 as follows: 

Summation of the total number of columns 212 across each row 210 provides the total
number of documents that contain the word represented by the row 210. These values are
represented at column 216. Summation of all the rows 210 across a column 212 provides the
number of documents within the category represented by that column 212. This is shown in
Figure 8 using reference numeral 218. Referring to Figure 8, word W1 appears twenty times in
category Cat2 and eight times in category Cat5. Word W1 does not appear in categories Cat1,
Cat3, and Cat4. Referring to column 216, word W1 appears a total of 28 times across all
categories. In other words, twenty-eight of the documents classified contain word W1.
Examination of an exemplary column 212, such as Cat1, reveals that word W2 appears once in
category Cat1, word W3 appears eight times in category Cat1, and word W5 appears twice in
category Cat1. Word W4 does not appear at all in category Cat1. As previously stated, word W1
does not appear in category Cat1. Referring to row 218, the entry corresponding to category Cat1
indicates that there are eleven documents classified in category Cat1.
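
The row and column totals discussed above can be checked with a short sketch. It uses only the entries stated in the text (the row for the first word and the column for the first category) and assumes, for illustration, that the table contains exactly the five words and five categories named there; the remainder of Figure 8 is not reproduced. The variable names are hypothetical.

```python
# Entries for the first word across the five categories, as stated in the text:
# none in the first, twenty in the second, none in the third and fourth,
# eight in the fifth.
w1_row = [0, 20, 0, 0, 8]

# Entries for the five words within the first category, as stated in the text:
# none for the first word, one for the second, eight for the third,
# none for the fourth, two for the fifth.
cat1_column = [0, 1, 8, 0, 2]

# Summing across a row gives that word's entry in column 216.
w1_total = sum(w1_row)

# Summing down a column gives that category's entry in row 218.
cat1_total = sum(cat1_column)

print(w1_total, cat1_total)
```

The two sums agree with the totals of twenty-eight and eleven given in the paragraph above.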

Please amend page 36, line 17 through page 37, line 4 as follows: 

With continued reference to Figure 8, Figure 9 illustrates a table 230 that stores the values 
that indicate the relative strength of each word with respect to the categories. Specifically, the 
percentage of data points occurring in each category (i.e., u) is presented in the form of a vector 
for each word. The value for each entry in the u vector is calculated according to the following 
formula: 

u = Prob(entry | category) = (entry_n, category_m) / category_m_total

Table 230 also presents the probability distribution of a data point's occurrence across all
categories (i.e., v) in the form of a vector for each word. The value for each entry in the v vector 
is calculated according to the following formula: 

v = Prob(category | entry) = (entry_n, category_m) / entry_n_total
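
The two formulas can be illustrated together with a minimal sketch. It assumes the table is held as a list of rows, one per word, with one count per category. The function name `u_and_v`, the choice to return 0.0 when a total is zero, and the 2 x 2 sample table are assumptions made for illustration and are not taken from Figures 8 or 9.

```python
def u_and_v(counts):
    """counts[n][m] is the table entry for word n in category m.

    Returns (u, v), where u[n][m] = counts[n][m] / total of category m
    and v[n][m] = counts[n][m] / total of word n.
    """
    row_totals = [sum(row) for row in counts]        # one total per word
    col_totals = [sum(col) for col in zip(*counts)]  # one total per category

    # Assumption: an empty category or unused word yields 0.0, not an error.
    u = [
        [c / col_totals[m] if col_totals[m] else 0.0 for m, c in enumerate(row)]
        for row in counts
    ]
    v = [
        [c / row_totals[n] if row_totals[n] else 0.0 for c in row]
        for n, row in enumerate(counts)
    ]
    return u, v


# Hypothetical table: two words (rows) by two categories (columns).
sample_counts = [
    [2, 6],
    [2, 2],
]
u_vectors, v_vectors = u_and_v(sample_counts)
```

In this sketch each row of `v_vectors` sums to one, because it distributes a word's occurrences over the categories, while each column of `u_vectors` sums to one, because it distributes a category's total over the words. Applied to the figures quoted earlier for the first word (twenty and eight occurrences out of twenty-eight), the v formula would give 20/28 and 8/28 for the two categories in which that word appears.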